Extending DBMSs with satellite databases
In this paper, we propose an extensible architecture for database engines where satellite databases are used to scale out and implement additional functionality for a centralized database engine. The architecture uses a middleware layer that offers consistent views and a single system image over a cluster of machines with database engines. One of these engines acts as a master copy while the others are read-only snapshots, which we call satellites. The satellites are lightweight DBMSs used for scalability and to provide functionality that is difficult or expensive to implement in the main engine. Our approach also supports the dynamic creation of satellites so that the system can adapt autonomously to varying loads. The paper presents the architecture, discusses the research problems it raises, and validates its feasibility with extensive experimental results.
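The routing idea in this abstract (one master copy, read-only satellites, a middleware layer in front) can be illustrated with a minimal sketch. Everything here is hypothetical: the names `ToyEngine` and `ToyMiddleware`, the round-robin read policy, and the eager propagation of writes are assumptions made for illustration, not the mechanism described in the paper.

```python
from itertools import cycle


class ToyEngine:
    """Hypothetical stand-in for a database engine: a name plus a key-value table."""

    def __init__(self, name):
        self.name = name
        self.rows = {}


class ToyMiddleware:
    """Hypothetical middleware: writes go to the master, reads go to satellites."""

    def __init__(self, master, satellites):
        self.master = master
        self.satellites = list(satellites)
        self._next = cycle(self.satellites)

    def write(self, key, value):
        # Updates are applied at the master copy first.
        self.master.rows[key] = value
        # Assumption: changes are pushed eagerly to every satellite.
        for sat in self.satellites:
            sat.rows[key] = value

    def read(self, key):
        # With no satellites available, fall back to the master.
        if not self.satellites:
            return self.master.name, self.master.rows.get(key)
        engine = next(self._next)  # round-robin over satellites (assumption)
        return engine.name, engine.rows.get(key)

    def add_satellite(self, name):
        # Dynamic creation: a new satellite starts as a snapshot of the master.
        sat = ToyEngine(name)
        sat.rows = dict(self.master.rows)
        self.satellites.append(sat)
        self._next = cycle(self.satellites)  # restart the rotation
        return sat


mw = ToyMiddleware(ToyEngine("master"), [ToyEngine("s1"), ToyEngine("s2")])
mw.write("a", 1)
first_read = mw.read("a")
second_read = mw.read("a")
mw.add_satellite("s3")
reads_after_add = [mw.read("a") for _ in range(3)]
```

The sketch ignores transactions and snapshot consistency entirely; it only shows where reads and writes are sent and how a new satellite could be seeded from the master.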
Processing SPARQL Queries Over Distributed RDF Graphs
We propose techniques for processing SPARQL queries over a large RDF graph in
a distributed environment. We adopt a "partial evaluation and assembly"
framework. Answering a SPARQL query Q is equivalent to finding subgraph matches
of the query graph Q over RDF graph G. Based on properties of subgraph matching
over a distributed graph, we introduce local partial matches as partial answers
in each fragment of RDF graph G. For assembly, we propose two methods:
centralized and distributed assembly. We analyze our algorithms both
theoretically and experimentally. Extensive experiments over both real and
benchmark RDF repositories of billions of triples confirm that our method is
superior to the state-of-the-art methods in both the system's performance and
scalability. Comment: 30 pages
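As a rough illustration of the "partial evaluation and assembly" idea, the sketch below lets each fragment compute matches for individual triple patterns and then joins them centrally on shared variables. This is a deliberately simplified stand-in, not the paper's local partial match definition or its assembly algorithms; the function names and the toy data are assumptions.

```python
def match_pattern(pattern, triple):
    """Return variable bindings if the triple matches the pattern, else None."""
    binding = {}
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # A variable repeated within one pattern must bind consistently.
            if binding.get(p, t) != t:
                return None
            binding[p] = t
        elif p != t:
            return None
    return binding


def local_partial_matches(fragment, patterns):
    """Partial evaluation: run on one fragment, using only its own triples."""
    return [
        (i, b)
        for i, pat in enumerate(patterns)
        for tr in fragment
        if (b := match_pattern(pat, tr)) is not None
    ]


def assemble(partials, n_patterns):
    """Centralized assembly: join per-pattern bindings on shared variables."""
    results = [{}]
    for i in range(n_patterns):
        candidates = [b for j, b in partials if j == i]
        results = [
            {**r, **b}
            for r in results
            for b in candidates
            if all(r.get(k, v) == v for k, v in b.items())
        ]
    return results


# Toy RDF graph split into two fragments (hypothetical data).
fragments = [
    [("alice", "knows", "bob")],
    [("bob", "worksAt", "acme")],
]
# Query: ?x knows ?y . ?y worksAt ?z
patterns = [("?x", "knows", "?y"), ("?y", "worksAt", "?z")]

partials = [m for f in fragments for m in local_partial_matches(f, patterns)]
answers = assemble(partials, len(patterns))
```

Here the answer spans both fragments, so neither site can produce it alone; only the assembly step combines the two partial results.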
GSI: GPU-friendly Subgraph Isomorphism
Subgraph isomorphism is a well-known NP-hard problem that is widely used in
many applications, such as social network analysis and query over the knowledge
graph. Due to the inherent hardness, its performance is often a bottleneck in
various real-world applications. Therefore, we address this by designing an
efficient subgraph isomorphism algorithm leveraging features of GPU
architecture, such as massive parallelism and memory hierarchy. Existing
GPU-based solutions adopt a two-step output scheme, performing the same join
process twice in order to write intermediate results concurrently. They also
lack GPU architecture-aware optimizations that allow scaling to large graphs.
In this paper, we propose a GPU-friendly subgraph isomorphism algorithm, GSI.
Different from existing edge join-based GPU solutions, we propose a
Prealloc-Combine strategy based on the vertex-oriented framework, which avoids
joining-twice in existing solutions. Also, a GPU-friendly data structure
(called PCSR) is proposed to represent an edge-labeled graph. Extensive
experiments on both synthetic and real graphs show that GSI outperforms the
state-of-the-art algorithms by up to several orders of magnitude and has good
scalability with graph size scaling to hundreds of millions of edges. Comment: 15 pages, 17 figures, conferenc
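The two-step output scheme that the abstract attributes to existing GPU solutions can be sketched sequentially: a first pass counts how many results each partial match produces, a prefix sum turns the counts into write offsets, and a second pass repeats the same join to write into disjoint slots. The code below is a CPU-side illustration under assumed names (`extend`, `two_step_join`); it does not show GSI's Prealloc-Combine strategy or the PCSR structure, whose details are not given in the abstract.

```python
from itertools import accumulate


def extend(row, adjacency):
    """Extend a partial match by one vertex, keeping the mapping injective."""
    return [row + (v,) for v in adjacency.get(row[-1], []) if v not in row]


def two_step_join(rows, adjacency):
    # Pass 1: perform the join only to count outputs per input row.
    counts = [len(extend(r, adjacency)) for r in rows]
    # An exclusive prefix sum gives each row its own output offset.
    offsets = [0] + list(accumulate(counts))[:-1]
    out = [None] * sum(counts)
    # Pass 2: perform the same join again, now writing results in place.
    for r, off in zip(rows, offsets):
        for k, new_row in enumerate(extend(r, adjacency)):
            out[off + k] = new_row
    return out


# Hypothetical toy graph as an adjacency list.
adjacency = {1: [2, 3], 2: [3], 3: []}
extended = two_step_join([(1,), (2,)], adjacency)
```

On a GPU each input row would be handled by a separate thread or warp, which is why the offsets must be known before any thread writes; the cost is that `extend` runs twice per row.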